A Composite Likelihood View for Multi-Label Classification
نویسندگان
چکیده
Given limited training samples, learning to classify multiple labels is challenging. Problem decomposition [24] is widely used in this case, where the original problem is decomposed into a set of easier-to-learn subproblems, and predictions from subproblems are combined to make the final decision. In this paper we show the connection between composite likelihoods [17] and many multilabel decomposition methods, e.g., one-vs-all, one-vs-one, calibrated label ranking, probabilistic classifier chain. This connection holds promise for improving problem decomposition in both the choice of subproblems and the combination of subproblem decisions. As an attempt to exploit this connection, we design a composite marginal method that improves pairwise decomposition. Pairwise label comparisons, which seem to be a natural choice for subproblems, are replaced by bivariate label densities, which are more informative and natural components in a composite likelihood. For combining subproblem decisions, we propose a new mean-field approximation that minimizes the notion of composite divergence and is potentially more robust to inaccurate estimations in subproblems. Empirical studies on five data sets show that, given limited training samples, the proposed method outperforms many alternatives.
منابع مشابه
Exploiting Associations between Class Labels in Multi-label Classification
Multi-label classification has many applications in the text categorization, biology and medical diagnosis, in which multiple class labels can be assigned to each training instance simultaneously. As it is often the case that there are relationships between the labels, extracting the existing relationships between the labels and taking advantage of them during the training or prediction phases ...
متن کاملInformation Theoretic Feature Selection in Multi-label Data through Composite Likelihood
In this paper we present a framework to unify information theoretic feature selection criteria for multi-label data. Our framework combines two different ideas; expressing multi-label decomposition methods as composite likelihoods and then showing how feature selection criteria can be derived by maximizing these likelihood expressions. Many existing criteria, until now proposed as heuristics, c...
متن کاملMLIFT: Enhancing Multi-label Classifier with Ensemble Feature Selection
Multi-label classification has gained significant attention during recent years, due to the increasing number of modern applications associated with multi-label data. Despite its short life, different approaches have been presented to solve the task of multi-label classification. LIFT is a multi-label classifier which utilizes a new strategy to multi-label learning by leveraging label-specific ...
متن کاملLearning with Limited Supervision by Input and Output Coding
In many real-world applications of supervised learning, only a limited number of labeled examples are available because the cost of obtaining high-quality examples is high or the prediction task is very specific. Even with a relatively large number of labeled examples, the learning problem may still suffer from limited supervision as the dimensionality of the input space or the complexity of th...
متن کاملText classification with sparse composite document vectors
In this work, we present a modified feature formation technique gradedweighted Bag of Word Vectors (gwBoWV) by (Vivek Gupta, 2016) for faster and better composite document feature representation. We propose a very simple feature construction algorithm that potentially overcomes many weaknesses in current distributional vector representations and other composite document representation methods w...
متن کامل